AITopics

Genre: Research Report > Promising Solution (0.54)

Industry:

Leisure & Entertainment (0.68)
Education > Curriculum > Subject-Specific Education (0.62)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.34)

Neural Information Processing Systems, Feb-19-2026, 06:53:25 GMT

80d46bb66ea003f4b29fa6013905d50a-Paper-Conference.pdf

dependency, information, representation, (16 more...)

Country:

Asia > China > Beijing > Beijing (0.05)
Asia > China > Guangdong Province > Guangzhou (0.04)
North America > United States > California > Los Angeles County (0.04)
(2 more...)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
(3 more...)

Neural Information Processing Systems, Feb-12-2026, 13:27:21 GMT

4a70f0c8443593ca59a88ebc8a937ed6-Supplemental-Datasets_and_Benchmarks_Track.pdf

accuracy, annotator, tallyqa, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Nov-26-2025

Interactive AI NPCs Powered by LLMs: Technical Report for the CPDC Challenge 2025

Huang, Yitian, Lei, Yuxuan, Lian, Jianxun, Liao, Hao

This report presents the solution and results of our team MSRA_SC in the Commonsense Persona-Grounded Dialogue Challenge (CPDC 2025). We propose a simple yet effective framework that unifies improvements across both GPU Track and API Track. Our method centers on two key components. First, Context Engineering applies dynamic tool pruning and persona clipping for input compression, combined with post-processing techniques such as parameter normalization and function merging. Together with manually refined prompts, this design improves tool call stability, execution reliability, and role-playing guidance. Second, in the GPU Track, we further adopt GRPO training, replacing supervised fine-tuning with reinforcement learning directly optimized by reward signals. This mitigates small-sample overfitting and significantly enhances task-oriented dialogue performance. In the final evaluation, our team ranks 1st in Task 2 API, 2nd in Task 1 API, and 3rd in both Task 3 API and GPU track, demonstrating the effectiveness of our approach. Our code is publicly available at https://gitlab.aicrowd.com/nikoo_yu/cpdc-2025-winning-solution

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2511.202

Country: Asia > China (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Javahar, Jeena, Budhrani, Tanya, Basha, Manaal, de Souza, Cleidson R. B., Beschastnikh, Ivan, Rodriguez-Perez, Gema

Cracking CodeWhisperer: Analyzing Developers' Interactions and Patterns During Programming Tasks

arXiv.org Artificial Intelligence, Oct-14-2025

The use of AI code-generation tools is becoming increasingly common, making it important to understand how software developers are adopting these tools. In this study, we investigate how developers engage with Amazon's CodeWhisperer, an LLM-based code-generation tool. We conducted two user studies with two groups of 10 participants each, interacting with CodeWhisperer: the first to understand which interactions were critical to capture and the second to collect low-level interaction data using a custom telemetry plugin. Our mixed-methods analysis identified four behavioral patterns: 1) incremental code refinement, 2) explicit instruction using natural language comments, 3) baseline structuring with model suggestions, and 4) integrative use with external sources. We provide a comprehensive analysis of these patterns. Several IDE-based code generation tools have been released in the past few years, such as GitHub's Copilot [8], Kite [14], Amazon's CodeWhisperer [20], Tabnine [22], and WPCode [28]. Research reveals that achieving the full potential of these tools requires a certain level of guidance to ensure that the tool's output aligns with the user's goal [21].

codewhisperer, large language model, natural language, (17 more...)

2510.11516

Country: North America > United States (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.68)
Information Technology (0.48)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)

Neural Information Processing Systems, Oct-10-2025, 01:29:34 GMT

4a70f0c8443593ca59a88ebc8a937ed6-Supplemental-Datasets_and_Benchmarks_Track.pdf

accuracy, annotator, tallyqa, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Oct-8-2025, 23:43:01 GMT

801750bc49fdc3d498e9ee63479f315e-Paper-Conference.pdf

machine learning, natural language, segmentation, (15 more...)

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.68)

Industry: Appliances & Durable Goods (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial Intelligence, Sep-30-2025

Model Fusion with Multi-LoRA Inference for Tool-Enhanced Game Dialogue Agents

Wang, Kangxu, Chen, Ze, Wei, Chengcheng, Zheng, Jiewen, He, Jiarong, Gao, Max

This paper presents the opdainlp team's solution for the GPU track of the CPDC 2025 challenge. The challenge consists of three tasks, aiming to build an in-game conversational AI that adheres to character personas, aligns with the game's worldview, and supports function calling. Considering both effectiveness and resource/time constraints during inference, we synthesized data for some of the tasks based on the datasets provided by the competition organizers. We employed Qwen3-14B with LoRA fine-tuning and model fusion, and utilized a base model integrated with multiple LoRA adapters during inference. Specifically, in the competition, we used three distinct LoRA adapters to handle tool calling, response generation with tool call results, and response generation without tool call results, respectively. Multi-LoRA inference was implemented using vLLM. Our solution achieved first place in Task 1 and Task 3, and second place in Task 2 of the GPU track.

large language model, machine learning, natural language, (22 more...)

2509.24229

Country: Europe > Austria (0.28)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Marmol-Romero, Alba Maria, Jimenez-Zafra, Salud Maria, Plaza-del-Arco, Flor Miriam, Molina-Gonzalez, M. Dolores, Martin-Valdivia, Maria-Teresa, Montejo-Raez, Arturo

SINAI at eRisk@CLEF 2022: Approaching Early Detection of Gambling and Eating Disorders with Natural Language Processing

arXiv.org Artificial Intelligence, Sep-19-2025

This paper describes the participation of the SINAI team in the eRisk@CLEF lab. Specifically, two of the proposed tasks have been addressed: i) Task 1 on the early detection of signs of pathological gambling, and ii) Task 3 on measuring the severity of the signs of eating disorders. The approach presented in Task 1 is based on the use of sentence embeddings from Transformers with features related to volumetry, lexical diversity, complexity metrics, and emotion-related scores, while the approach for Task 3 is based on text similarity estimation using contextualized word embeddings from Transformers. In Task 1, our team has been ranked in second position, with an F1 score of 0.808, out of 41 participant submissions. In Task 3, our team also placed second out of a total of 3 participating teams.

artificial intelligence, machine learning, natural language, (18 more...)

2509.14806

Country: Europe > Spain (0.28)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Ryan, Yuriel, Tan, Rui Yang, Choo, Kenny Tsu Wei, Lee, Roy Ka-Wei

Humor in Pixels: Benchmarking Large Multimodal Models Understanding of Online Comics

arXiv.org Artificial Intelligence, Sep-18-2025

Understanding humor is a core aspect of social intelligence, yet it remains a significant challenge for Large Multimodal Models (LMMs). We introduce PixelHumor, a benchmark dataset of 2,800 annotated multi-panel comics designed to evaluate LMMs' ability to interpret multimodal humor and recognize narrative sequences. Experiments with state-of-the-art LMMs reveal substantial gaps: for instance, top models achieve only 61% accuracy in panel sequencing, far below human performance. This underscores critical limitations in current models' integration of visual and textual cues for coherent narrative and humor understanding. By providing a rigorous framework for evaluating multimodal contextual and narrative reasoning, PixelHumor aims to drive the development of LMMs that better engage in natural, socially aware interactions.

large language model, machine learning, natural language, (22 more...)

2509.12248

Country:

Europe (1.00)
North America > United States (0.28)
North America > Canada (0.28)
Asia > China (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.96)
(4 more...)